 ipython notebook


Steps toward MLOps research -- Software Engineering your AI

#artificialintelligence

Originally published on Towards AI. MLOps is an old requirement for a new field; the Machine Learning world is evolving, from a niche topic in the darkest backgrounds to a first-class citizen in a wide range of use cases.


Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild

arXiv.org Machine Learning

Accessibility is a major challenge of machine learning (ML). Typical ML models are built by specialists and require specialized hardware/software as well as ML experience to validate. This makes it challenging for non-technical collaborators and endpoint users (e.g. physicians) to easily provide feedback on model development and to gain trust in ML. The accessibility challenge also makes collaboration more difficult and limits the ML researcher's exposure to realistic data and scenarios that occur in the wild. To improve accessibility and facilitate collaboration, we developed an open-source Python package, Gradio, which allows researchers to rapidly generate a visual interface for their ML models. Gradio makes accessing any ML model as easy as sharing a URL. Our development of Gradio is informed by interviews with a number of machine learning researchers who participate in interdisciplinary collaborations. Their feedback identified that Gradio should support a variety of interfaces and frameworks, allow for easy sharing of the interface, allow for input manipulation and interactive inference by the domain expert, as well as allow embedding the interface in iPython notebooks. We developed these features and carried out a case study to understand Gradio's usefulness and usability in the setting of a machine learning collaboration between a researcher and a cardiologist.
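
As a rough illustration of what generating a visual interface for a model can look like, the sketch below wraps a toy function in a Gradio interface. The function `classify_heart_rate` and its thresholds are hypothetical stand-ins for a trained model, not anything from the paper, and the `gr.Interface` call assumes a reasonably recent Gradio release. The import is placed inside `build_demo` so the toy function can be tried without Gradio installed.

```python
def classify_heart_rate(bpm: float) -> str:
    """Toy stand-in for a trained model (thresholds are illustrative only)."""
    if bpm < 60:
        return "low"
    if bpm <= 100:
        return "normal"
    return "high"


def build_demo():
    """Wrap the function in a web UI; requires `pip install gradio`."""
    import gradio as gr  # imported lazily so the rest of the file runs without it

    # "number" and "text" are Gradio's string shortcuts for simple components.
    return gr.Interface(fn=classify_heart_rate, inputs="number", outputs="text")
```

Calling `build_demo().launch(share=True)` would start a local server and request a temporary public URL, which is the "as easy as sharing a URL" workflow the abstract describes.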


Complete Guide to Parameter Tuning in Gradient Boosting (GBM) in Python

@machinelearnbot

If you have been using GBM as a 'black box' until now, maybe it's time to open it and see how it actually works! This article is inspired by Owen Zhang's (Chief Product Officer at DataRobot and Kaggle Rank 3) approach shared at NYC Data Science Academy. He delivered a two-hour talk, and I intend to condense it and present the most precious nuggets here. Boosting algorithms play a crucial role in dealing with the bias-variance trade-off. Unlike bagging algorithms, which only control for high variance in a model, boosting controls both aspects (bias and variance) and is considered to be more effective.
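
To make the "open the black box" idea concrete, here is a minimal from-scratch sketch of gradient boosting for squared loss, using one-split decision stumps on a single feature. It is not the article's code or any library's implementation; the parameter names `n_estimators` and `learning_rate` are borrowed from common GBM libraries for familiarity, and the data is a made-up quadratic.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    # Largest value is excluded as a threshold so the right side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean


def fit_gbm(xs, ys, n_estimators=50, learning_rate=0.1):
    """Additively fit stumps to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_estimators):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up training data: y = x^2 on a small grid.
xs = [i / 10 for i in range(40)]
ys = [x * x for x in xs]

few = fit_gbm(xs, ys, n_estimators=5)
many = fit_gbm(xs, ys, n_estimators=100)
```

Comparing `mse(few, xs, ys)` with `mse(many, xs, ys)` shows training error falling as stages are added, which is the bias-reduction side of the trade-off. Controlling variance is what tuning the learning rate, tree size, and number of stages against held-out data is for.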


Bayesian Methods for Hackers

@machinelearnbot

Of course, as an introductory book, we can only leave it at that: an introductory book. The mathematically trained may cure the curiosity this text generates with other texts designed with mathematical analysis in mind. For the enthusiast with less mathematical background, or one who is interested not in the mathematics but simply in the practice of Bayesian methods, this text should be sufficient and entertaining. The choice of PyMC as the probabilistic programming language is two-fold. As of this writing, there is currently no central resource for examples and explanations in the PyMC universe.


marcotcr/lime

#artificialintelligence

This project is about explaining what machine learning classifiers (or models) are doing. At the moment, we support explaining individual predictions for text classifiers or classifiers that act on tables (numpy arrays of numerical or categorical data), with a package called lime (short for local interpretable model-agnostic explanations). Lime is based on the work presented in this paper. Our plan is to add more packages that help users understand and interact meaningfully with machine learning. Lime is able to explain any black box text classifier, with two or more classes.
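
The idea of explaining an individual prediction of a black-box text classifier can be sketched with a much simpler relative of LIME: remove one word at a time and see how the prediction moves. This is not the `lime` package's API or its algorithm (LIME samples many perturbed texts and fits a locally weighted linear model). The classifier `toy_spam_probability` and the helper `explain_by_omission` are hypothetical names introduced only for this illustration.

```python
def toy_spam_probability(text: str) -> float:
    """Hypothetical black-box classifier: P(spam) from keyword counts."""
    spam_words = {"free", "winner", "prize"}
    hits = sum(word in spam_words for word in text.lower().split())
    return hits / (hits + 2)  # squashes the count into [0, 1)


def explain_by_omission(text, predict):
    """Score each word by how much removing it changes the prediction."""
    words = text.split()
    baseline = predict(text)
    scores = []
    for i, word in enumerate(words):
        without = " ".join(words[:i] + words[i + 1:])
        scores.append((word, baseline - predict(without)))
    # Largest absolute effect first.
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


explanation = explain_by_omission("claim your free prize today", toy_spam_probability)
```

Here `explanation` ranks "free" and "prize" above the other words, because only their removal changes the toy classifier's output. The classifier is treated purely as a function from text to probability, which is what makes the approach model-agnostic.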


Machine Learning in a Year

@machinelearnbot

My interest in ML dates back to 2014, when I started reading articles about it on Hacker News. I simply found the idea of teaching machines stuff by looking at data appealing. At the time I wasn't even a professional developer, but a hobby coder who'd done a couple of small projects. So I began watching the first few chapters of Udacity's Supervised Learning course, while also reading all the articles I came across on the subject. This gave me a little bit of conceptual understanding, though no practical skills.


Ahem Detector with Deep Learning

@machinelearnbot

Francesco is a Data Scientist at Janssen Pharmaceutical Companies of Johnson & Johnson and a science writer. He is committed to the "A World Without Disease" paradigm shift in healthcare, leveraging Artificial Intelligence and Data Science to predict risk and intercept diseases. He is focused on putting machine learning at the service of human beings. Do you know why you can't hear the ugly ahem sounds on the podcast Data Science at Home? Let me introduce the ahem detector, a deep convolutional neural network that is trained on transformed audio signals to recognize "ahem" sounds. The network has been trained to detect such signals on the episodes of Data Science at Home, the podcast about data science at worldofpiggy.com/podcast.


jupyter/jupyter

@machinelearnbot

Recitations from Tel-Aviv University introductory course to computer science, assembled as IPython notebooks by Yoav Ram. Exploratory Computing with Python, a set of 15 Notebooks that cover exploratory computing, data analysis, and visualization. No prior programming knowledge required. Each Notebook includes a number of exercises (with answers) that should take less than 4 hours to complete. Developed by Mark Bakker for undergraduate engineering students at the Delft University of Technology.


Levvel Blog - Machine Learning Part Two--Running a Machine Learning Data Store on Redis Labs

#artificialintelligence

Editor's note: This is the second post in a two-part series about machine learning. In part one, we discussed how to get started with machine learning: define, benchmark, and deploy. Managing large, pre-trained predictive models across an organization and ensuring the same version is in production can be a challenge, given the rapid pace of change in the AI/machine learning space. Here, we have an approach that demonstrates how to automate building, storing, and deploying predictive models from a Remote Machine Learning Data Store hosted on Redis Labs. This approach is focused on showing how DevOps CI/CD artifact pipelines can be used to build and manage machine learning model artifacts with Jupyter IPython notebooks, accompanying command line automation versions, and administration tools to help manage artifacts across a team.
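
The idea of treating models as versioned artifacts in a key-value store can be sketched as follows. Everything here is an assumption made for illustration: the `InMemoryStore` class stands in for a remote Redis instance, and the `model:<name>:<version>` key scheme and the helpers `publish_model` and `load_model` are hypothetical, not the layout or tooling used in the post.

```python
import hashlib
import json


class InMemoryStore(dict):
    """Stand-in for a remote key-value store such as Redis."""


def publish_model(store, name, version, params):
    """Serialize model parameters and store them under a versioned key."""
    payload = json.dumps(params, sort_keys=True)
    store[f"model:{name}:{version}"] = payload
    store[f"model:{name}:latest"] = version  # pointer that deployments follow
    # A content hash lets a pipeline verify that two environments hold the same artifact.
    return hashlib.sha256(payload.encode()).hexdigest()


def load_model(store, name, version=None):
    """Fetch a specific version, or whatever 'latest' currently points to."""
    version = version or store[f"model:{name}:latest"]
    return version, json.loads(store[f"model:{name}:{version}"])


store = InMemoryStore()
digest_v1 = publish_model(store, "churn", "v1", {"weights": [0.2, -0.5], "bias": 0.1})
digest_v2 = publish_model(store, "churn", "v2", {"weights": [0.3, -0.4], "bias": 0.0})
current_version, current_params = load_model(store, "churn")
```

Pinning a version explicitly, as in `load_model(store, "churn", "v1")`, is one way to ensure every environment runs the same artifact. Against a real Redis server the dictionary operations would be replaced by client `get` and `set` calls.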

